Analysis of Integrated Data without Data Integration

Authors

  • Alan F. Karr
  • Xiaodong Lin
  • Ashish P. Sanil
  • Jerome P. Reiter
Abstract

Many scientific and policy investigations require statistical analyses that “integrate” data stored in multiple, distributed databases. For example, a regression analysis on integrated state databases about factors influencing student performance would be more insightful than individual analyses, or complementary to them. Other contexts where the same need arises range from homeland security to environmental monitoring. At the same time, the barriers to actually integrating the databases are numerous. One is confidentiality: the database holders, which we term “agencies,” almost always wish to protect the identities of their data subjects. Another is regulation: the agencies may be forbidden by law to share their data, either with each other or with a trusted third party. A third is scale: despite advances in networking technology, the only way to move a terabyte of data from point A today to point B tomorrow is FedEx. The good news is that for many analyses it is not necessary to move the data. Instead, using techniques from computer science known generically as secure multiparty computation, the agencies can share summaries of the data anonymously, in a way that still allows the analysis to be performed in a statistically principled manner. In this article we illustrate linear regression on “horizontally partitioned data.” Only one concept is needed, that of secure summation, which is shown in Figure 1. There are other approaches to this problem for lower risk situations, as ...

How can secure multiparty computation enable agencies to share information without sacrificing confidentiality?
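The abstract does not spell out the protocol, so the following is only a minimal sketch of one common ring-based formulation of secure summation, applied to simple linear regression on horizontally partitioned data (each agency holds different records with the same variables). All names (`secure_sum`, `local_statistics`, `pooled_regression`), the modulus, and the fixed-point scale are illustrative assumptions, not the authors' implementation. The sketch simulates all agencies in one process; in a real deployment each running total would be passed from one agency to the next.

```python
import secrets

MODULUS = 2**64  # assumption: larger than twice any true total in absolute value
SCALE = 10**6    # assumption: fixed-point scale used to encode real values as integers


def secure_sum(local_values, modulus=MODULUS):
    """Simulate ring-based secure summation of one integer per agency."""
    mask = secrets.randbelow(modulus)             # random mask known only to agency 1
    running = (mask + local_values[0]) % modulus  # agency 1 sends a masked value
    for value in local_values[1:]:                # each later agency adds its own value
        running = (running + value) % modulus
    total = (running - mask) % modulus            # agency 1 removes the mask
    # Map the residue back to a signed integer.
    return total - modulus if total > modulus // 2 else total


def local_statistics(rows):
    """Per-agency sufficient statistics for y = a + b*x, as fixed-point integers."""
    stats = [
        len(rows),
        sum(x for x, _ in rows),
        sum(y for _, y in rows),
        sum(x * x for x, _ in rows),
        sum(x * y for x, y in rows),
    ]
    return [round(s * SCALE) for s in stats]


def pooled_regression(agency_rows):
    """Fit y = a + b*x on the pooled data using only securely summed statistics."""
    per_agency = [local_statistics(rows) for rows in agency_rows]
    n, sx, sy, sxx, sxy = (
        secure_sum([stats[k] for stats in per_agency]) / SCALE for k in range(5)
    )
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


# Hypothetical data: three agencies, records never leave their owner.
agencies = [[(0, 1), (1, 3)], [(2, 5), (3, 7)], [(4, 9)]]
intercept, slope = pooled_regression(agencies)
```

Only the five pooled totals are revealed, and they are exactly the entries needed for the least-squares solution, so the fit matches what a regression on the physically integrated data would give (up to fixed-point rounding). This simple form of the protocol assumes the agencies follow the steps honestly and do not collude with one another.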

Publication date: 2004